The Relationship of Tree Canopy to Urban Heat Effect and Inequality: A Case Study

Authors
Affiliation

Eleanor Lindsey

Colorado State University

Clara Jordan

Colorado State University

Sierra Champion

Colorado State University

ABSTRACT

This study investigates the relationship between tree canopy cover, urban temperature, and income levels in Los Angeles, California, with a focus on the Urban Heat Island (UHI) effect. As cities like Los Angeles continue to expand and global temperatures rise, urban areas are experiencing higher heat exposure, particularly in neighborhoods with limited green space. We aimed to test two main hypotheses: (1) that lower tree canopy coverage would be associated with higher temperatures, and (2) that lower canopy coverage would be correlated with lower income, based on the “luxury effect” described in previous research. Using data from 2003 to 2023, including average annual temperatures, tree canopy loss, and 2023 census tract median incomes, we conducted visual and statistical analyses to explore these relationships. While visualizations suggested a strong association between canopy loss and temperature, the Pearson correlation test did not yield statistically significant results. Additionally, our analysis found no measurable correlation between tree canopy and income, although this may be due to limitations in the data structure. Despite these challenges, our findings reinforce the importance of urban tree canopies in regulating temperature and highlight the need for more targeted, high-resolution data to understand environmental inequalities in urban heat exposure fully.

INTRODUCTION & HYPOTHESIS

Due to the effects of climate change, large cities like Los Angeles, California, are expected to see regularly increasing temperatures (CaliforniaClimateChange?). When looking at a heat map, an odd phenomenon shows higher temperatures within city limits than in rural areas on the border. This unequal heating is called the Urban Heat Island (UHI) effect (HeatIslandHealthEffects?). This is due to reduced vegetation cover, increased concrete, disruptive buildings, and human activities (HeatIslandHealthEffects?). This effect is projected to worsen as more of the world’s population migrates to urban environments, cities continue to grow, and global temperatures rise (HeatIslandTrends?). Areas already impacted by the UHI effect will “bear the brunt” of intensified heat waves brought on by climate change (HeatIslandTrends?).

From the 1980s onwards, environmental movements began to advocate for increased green space and sustainability in urban planning (LAHeatIsland?). Initiatives promoting tree canopy restoration, urban greening, and park creation have been prioritized to mitigate the impacts of climate change and improve public health (CoolingEffectsOfTrees?). Notably, the city has recognized the need to address the disproportionate effects of urban heat on low-income communities (UrbanGreenSpaceInequality?). However, despite these efforts, many neighborhoods still struggle with heat exposure and lack adequate vegetation, emphasizing the need for continued research and policy reform to ensure equitable access to urban green spaces (UrbanGreenSpaceInequality?).

During this period, a growing awareness of environmental issues began to emerge (HeatIslandTrends?). Land-use planning and urban development often neglect the integration of green spaces, leading to stark inequalities within the city (HeatIslandTrends?). Areas predominantly inhabited by low-income communities and people of color were typically underserved in terms of infrastructure and green spaces, resulting in disparities in health outcomes and environmental benefits compared to wealthier neighborhoods (UrbanGreenSpaceInequality?).

Even before considering climate change, the urban heat island effect can have severe impacts, including changing urban ecosystems and biodiversity and exacerbating human health issues (HeatIslandHealthEffects?). Increasing urban temperatures can lead to health impacts, including increased exposure to extreme heat, air pollution, heat stroke, cardiorespiratory mortality, kidney disease, and mental illness, among others (HeatIslandHealthEffects?).

One of the primary benefits of urban green spaces is their ability to cool the surrounding environment (CoolingEffectsOfTrees?). Trees and vegetation play a vital role in reducing temperatures through evapotranspiration, where water is absorbed by the roots and released as water vapor from the leaves. This natural cooling effect can significantly lower urban temperatures, helping to combat the extreme heat conditions associated with the Urban Heat Island (UHI) effect (HeatIslandTrends?). Research shows that areas with denser tree canopies can experience temperature reductions of five to ten degrees Fahrenheit compared to surrounding asphalt or concrete regions (CoolingEffectsOfTrees?).

Additionally, urban green spaces help improve air quality (UrbanGreenSpacesAirQuality?). Trees act as natural air filters by absorbing pollutants and providing oxygen. Green spaces have been linked to reductions in air pollutants like carbon dioxide, nitrogen dioxide, and particulate matter, thereby promoting healthier urban environments (UrbanGreenSpacesAirQuality?). Studies indicate that residents living near green spaces report better physical and mental health outcomes, likely due to increased opportunities for physical activity, social interaction, and psychological restoration provided by nature (HealthBenefitsOfGreenSpaces?).

Los Angeles is a prime example of the UHI effect as it is a large city housing millions of people with a tall, dense skyline. Asphalt and blacktop trap a significant amount of heat, making the temperatures within the five counties that make up the metropolis hotter than those in surrounding rural areas (LAHeatIsland?). In this study, we aim to investigate how the urban heat island effect in Los Angeles is related to tree cover and income. We intend to explore if tree cover and temperature are related, and how this may be spatially associated with income. We were inspired by Schell et al. 2020 b’s thoughts on the luxury effect, the idea that urban biodiversity is positively correlated with neighborhood wealth (RacismInUrbanEnvironments?). This curiosity raised questions like: How will climate change exacerbate urban heat? Can we identify neighborhoods in Los Angeles at higher risk of urban heat health impacts? To investigate this question, the objective of our study was to answer the following question: Does decreased tree canopy contribute to increased urban temperatures and correlate with median income in Los Angeles? Our hypotheses include: A) As tree canopy decreases, the mean annual temperature will increase because of the urban heat island effect, and B) As tree canopy decreases, the mean annual income will decrease due to the historical marginalization of low-income home communities in neighborhoods with little green space, per the luxury effect (RacismInUrbanEnvironments?)

To explore these hypotheses, we obtained Los Angeles-centric data on yearly average temperatures in Los Angeles from 2013 - 2023 (TempData?), tree canopy cover loss from 2001 - 2023 (LATreeLoss?), and median income for 2023 (MedianIncomeCensusTract?). With R as our analysis platform, we combined these datasets and visually analyzed the relationships between our three variables using scatter plots. All analyses were performed using R version 2024.12.1.

METHODS

We compiled datasets on air temperature records, median income, and land cover (TempData?; MedianIncomeCensusTract?; LATreeLoss?). Our question pertains to the heat island effect, which is correlated to the change in air temperature within city limits (HeatIslandTrends?). Moreover, we analyzed the relationship between median household income and the urban heat island (UHI) effect. We first cleaned our three data sets utilizing a dplyr verb pipeline in R to perform filters, selections, and omissions of NA values. These data sets had different temporal boundaries, with the income data spanning a five-year range and the other two sources covering 2003 to 2023. We addressed this by limiting our income data analysis from 2018 to 2023 and disregarding previous years. 

The second step to preparing our data for analysis was to join our data sets (temperature, land cover, and income) using a left join on the common “year” column. This was done to create a unified data frame to analyze. This joined data frame was then analyzed to answer our question: Does decreased tree canopy contribute to increased urban temperatures and correlate with median income? To answer this question, we tested both hypotheses in the following manner: we used a Pearson correlation test to examine the correlation between tree canopy and mean annual temperature in Los Angeles, and we also used the correlation function to test the correlation between tree canopy and mean annual income in Los Angeles. To bolster our statistical tests, we also explored the relationships between our variables by creating visualizations that show trends in temperature and tree canopy loss over the project’s temporal span. To compare tree canopy cover loss with temperature changes, we disaggregated the temperature data by two locations: one urban and one rural, to see if the UHI effects are more severe in urban environments. This would support Hypothesis A. 

These methods helped us answer our question by revealing potential relationships between tree canopy, mean annual temperature, and income. This allowed us to draw conclusions about the extent of the UHI effect in urban and rural areas of Los Angeles and to relate temperature trends to the loss of tree canopy cover over time. Moreover, the results incorporating the analysis of income trends are inconclusive, as shown in the following section.

RESULTS

This study found that as the tree canopy decreases, the mean annual temperature increases, and found no relationship between tree canopy and median income. To test Hypothesis A, as the tree canopy decreases, the mean annual temperature is expected to increase due to the urban heat island effect, we first compared urban and suburban temperature data from downtown Los Angeles and the Malibu Hills, respectively. This comparison revealed a stark contrast between suburban and urban temperatures across the twenty-year time series from 2003 to 2023. Consistently throughout this time series, the Malibu Hills temperature records were lower than those for downtown Los Angeles, as summarized in Figure 1 below. 

library(here)
library(dplyr)
library(readr)
library(ggplot2)
library(lubridate)
library(tidyverse)
library(plotly)
temp_data1<-read_csv(here("data/2003-2005 Temperature Data(in).csv"))
temp_data2<-read_csv(here("data/2006-2012 Temperature Data(in).csv"))
temp_data3<-read_csv(here("data/2013-2019 Temperature Data(in).csv"))
temp_data4<-read_csv(here("data/2020-2023 Temperature Data(in).csv"))
temperature_data<-bind_rows(temp_data1, temp_data2, temp_data3, temp_data4)%>%
  rename('Year'='DATE')%>%
  select(-STATION)%>%
  na.omit()

temperature_data$Year <- as.numeric(temperature_data$Year)

str(temperature_data$Year)
 num [1:745] 2003 2004 2005 2003 2004 ...
downtown_stations <- c("LOS ANGELES DOWNTOWN USC, CA US", "MALIBU HILLS CALIFORNIA, CA US")
temperature_data_filtered <- temperature_data %>%
  filter(NAME %in% downtown_stations)

station_labels<- c("LOS ANGELES DOWNTOWN USC, CA US"="Downtown LA","MALIBU HILLS CALIFORNIA, CA US"='Malibu Hills' )

Temperature_plot <- ggplot(temperature_data_filtered, aes(x = Year, y = TAVG, color = NAME)) +
  geom_line() +
  scale_x_continuous(
    breaks = seq(2003, 2023, by = 4), 
    labels = seq(2003, 2023, by = 4)) + 
  scale_color_manual(values = c("dodgerblue", "firebrick"), labels = station_labels) +
  labs(
    title = "Average Temperature in Los Angeles Downtown Stations",
    x = "Year",
    y = "Temperature (°F)"
  ) +
  theme_minimal(base_size = 14)

ggplotly(Temperature_plot)
Source: Article Notebook

Figure 1: Yearly average temperature (degrees Fahrenheit) for USGS weather stations in downtown Los Angeles (blue line) and one within the Malibu Hills area (red line). The data ranges from 2003 to 2023 recording trends of increased temperature spikes in Downtown in the years of 2006, 2008, 2015, and 2016. The Malibu Hills station recorded lower spikes in temperature during similar years.

In Figure 1, the temperature data is categorized by a yearly average temperature in LA in degrees Fahrenheit. These results confirm the existence of the urban heat island (UHI), characterized by higher and more variable temperatures in urban environments compared to suburban environments. This study then incorporated tree canopy loss as a variable influencing temperature. Figure 2 visualizes trends in tree canopy loss in Los Angeles over the same time series considered for temperature data. We can see continual tree canopy loss over the twenty years considered, characteristic of urban environments which continue to develop as population increases over time. Spikes indicate periods of elevated tree canopy loss, such as in 2009 and 2020. 

tree_canopy_data<-read_csv(here("data/Tree_canopy_loss/treecover_loss__ha.csv"))%>%
  select(-adm1, -adm2, -iso)%>%
  rename('Year'='umd_tree_cover_loss__year','Tree_loss_ha'='umd_tree_cover_loss__ha' )
Tree_loss_plot<-ggplot(tree_canopy_data, aes(x=Year, y=Tree_loss_ha))+
  geom_area(fill="forestgreen",color="forestgreen")+
  labs(x="Year", y="Tree Canopy Loss (ha)", title="Tree Canopy Loss in LA per Year")

Tree_loss_plot

Figure 2: Yearly tree canopy loss (hectares) in Los Angeles California from 2003 to 2023. The green shaded region shows how much canopy growth was lost per year including natural causes (wildfires, disease) and human based activities (urbanization, logging). The largest spike in the data occurred in 2009.

Given the hypothesized relationship between tree canopy loss and temperature due to the UHE, it is no surprise that this study also found a very strong visual correlation between these two variables. As seen below in Figure 3, temperature trends follow those of tree canopy loss in Los Angeles. While the spikes and dips are not to the same order of magnitude between these two variables, it is evident that increases in tree canopy cover loss impact temperature. For example, in 2009, there was a significant decrease in tree canopy cover of over 20,000 hectares. This large spike can be attributed to the Station fire in California that burned over 60,000 hectares of National Forest (Station-Fire?). After this spike, temperature readings in both downtown Los Angeles and Malibu Hills increased steadily for approximately the next five years. It is important to note that throughout these changes in tree canopy cover, the variation in Malibu Hills temperature records is less than that of downtown Los Angeles, further demonstrating the regulatory and cooling impact of tree canopy in developed environments

median_income_data <- read_csv(here("data/Median_Income_and_AMI_(census_tract).csv"))%>%
  select(-tract, -sup_dist, -ESRI_OID, -Shape__Area, -Shape__Length )
combined_Temp_tree_data<-left_join(tree_canopy_data,temperature_data, by='Year')%>%
  rename('Temp_average'='TAVG', 'Station_name'='NAME')
combined_income_tree_temp <- cross_join(combined_Temp_tree_data, median_income_data)
combined_income_tree_temp <- combined_income_tree_temp[-c(1:4990), ]

station_labels <- c(
  "LOS ANGELES DOWNTOWN USC, CA US" = "Downtown LA",
  "MALIBU HILLS CALIFORNIA, CA US" = "Malibu Hills")

filtered_data_combined <- combined_income_tree_temp %>%
  filter(Station_name %in% c("LOS ANGELES DOWNTOWN USC, CA US", "MALIBU HILLS CALIFORNIA, CA US"))


scale_factor <- max(combined_income_tree_temp$Tree_loss_ha, na.rm = TRUE) /
                max(combined_income_tree_temp$Temp_average, na.rm = TRUE)

# Plot
combined_plot <- ggplot(filtered_data_combined, aes(x = Year)) +
  # Tree loss line
  geom_line(aes(y = Tree_loss_ha), color = "forestgreen", size = 1) +
  
  # Temperature lines by station
  geom_line(aes(y = Temp_average * scale_factor, color = Station_name), size = 1) +

  scale_x_continuous(
    breaks = seq(2003, 2023, by = 2),
    labels = seq(2003, 2023, by = 2)
  ) +
  scale_y_continuous(
    name = "Tree Canopy Loss (ha)",
    sec.axis = sec_axis(~./scale_factor, name = "Average Temperature (°F)")
  ) +
  scale_color_manual(
    values = c("LOS ANGELES DOWNTOWN USC, CA US" = "dodgerblue",
               "MALIBU HILLS CALIFORNIA, CA US" = "firebrick"),
    labels = station_labels,
    name = "Weather Station"
  ) +
  theme_minimal() +
  theme(
    axis.title.y.left = element_text(color = "forestgreen"),
    axis.text.y.left = element_text(color = "forestgreen"),
    axis.title.y.right = element_text(color = "black"),
    axis.text.y.right = element_text(color = "black"),
    legend.position = "bottom"
  ) +
  labs(
    title = "Tree Loss and Average Temperature by Year",
    x = "Year"
  )
ggplotly(combined_plot)

Figure 3: A combined graph of the yearly average temperatures and canopy loss. USGS weather stations reported in degrees Fahrenheit and tree canopy data reported in hectares. The downtown Los Angeles weather station (blue line) depicts higher spikes in temperature than the Malibu Hills weather station (red line). The overall trend in temperature matches the spikes in tree canopy loss (green line). It is of note that the largest spike occurring in 2009 was the result of the station fire in one of the California National Parks.

This relationship between tree canopy loss and temperature, however, was surprisingly not confirmed by a correlation test performed between the two variables. A Pearson’s product-moment correlation test revealed a t-value of approximately 0.69 and a p-value of roughly 0.48. T-values represent the “test statistic,” which varies within a plausible range of 0 to 1, with 0 representing no correlation and 1 representing perfect correlation (which is rare). Thus, a t-value of 0.69 between tree canopy loss and temperature indicates a correlation between the two that is not particularly strong. This relative weakness of correlation is confirmed by the p-value of 0.48, which, being greater than the critical value of 0.05, indicates a statistically insignificant result. The failure of this statistical test does not mean that there is no relationship between these variables. The Pearson’s product-moment correlation test involves several assumptions that would have to be further investigated to produce a solid conclusion on whether Hypothesis A is rejected or not. These assumptions include variable linearity, homogeneity of variance, and the absence of outlier values. The vast difference in the order of magnitudes between tree canopy loss and temperature may have influenced the results of the correlation test. 

combined_Temp_tree_data<-left_join(tree_canopy_data,temperature_data, by='Year')%>%
  rename('Temp_average'='TAVG', 'Station_name'='NAME')

Combined_cor<-cor.test(combined_Temp_tree_data$Temp_average, combined_Temp_tree_data$Tree_loss_ha)

print(Combined_cor)

    Pearson's product-moment correlation

data:  combined_Temp_tree_data$Temp_average and combined_Temp_tree_data$Tree_loss_ha
t = 0.69235, df = 743, p-value = 0.4889
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.04652181  0.09704331
sample estimates:
       cor 
0.02539167 

Figure 4: A correlation test between average temperature and yearly tree canopy loss. Returning a P-value of 0.4889 shows a lack of statistical evidence that there is strong correlation. Using a critical value of 0.05 the t-value concludes that the null hypothesis has failed to be rejected signifying the correlation test is not statistically significant.

In addition to exploring Hypothesis A, this study also investigated Hypothesis B, as tree canopy decreases, mean annual income will decrease due to the historical marginalization of low-income home communities in neighborhoods with little green space, in accordance with ideas brought forward by the luxury effect. To test this hypothesis, as outlined in the Methods section, we performed yet another correlation test. This correlation test between tree canopy loss and median income yielded a curious result, as shown in Figure 5 below. 

median_income_data <- read_csv(here("data/Median_Income_and_AMI_(census_tract).csv"))%>%
  select(-tract, -sup_dist, -ESRI_OID, -Shape__Area, -Shape__Length )
tree_income_cor<- cor(combined_income_tree_temp[,c("Tree_loss_ha", "med_hh_income")])


print(tree_income_cor)
              Tree_loss_ha med_hh_income
Tree_loss_ha             1            NA
med_hh_income           NA             1

Figure 5: A correlation test between the average household median income and yearly tree canopy loss. The results returned inconclusive results with NA values in correlation between the variables. The data has a five year time span and comes from a census done in California.

A result of “NA” for correlating tree canopy loss to median income may have several explanations. This result can occur if there was a data cleaning error and the correlation test was performed on data with “Not Available” values. However, this study corrected for the presence of NAs during data cleaning. Other possible causes could include that our variables had minimal variance, data with too few observations, or data type issues. Much like the NA value issue, this study, when carrying out its method, confirmed that these issues were corrected. Finally, the last explanation is that R, when computing the correlation, created an output of NA to indicate that the two variables being compared were too different to compare. From these, statistically, this study found that Hypothesis B should be rejected. However, the authors of this study disagree with this statistical conclusion due to potential assumptions in the correlation test and evidence in existing literature that suggests a relationship between tree canopy and income. This will be discussed in the “Discussion and Conclusions” section below. 

DISCUSSION & CONCLUSIONS

Our study examined the relationship between tree canopy cover, temperature, and income in Los Angeles, focusing on how the Urban Heat Island (UHI) effect may impact different communities. While our visual data showed clear trends, such as how Downtown LA is consistently hotter than suburban areas like Malibu Hills, and how tree canopy loss has increased over time, the results from our statistical tests were more mixed.

For Hypothesis A, we expected to see a strong connection between decreasing tree canopy and increasing temperature. Visually, that relationship is evident here [Figure 3]. For example, years with significant spikes in canopy loss, like 2009, were followed by steady increases in average temperature. However, when we ran a Pearson correlation test, the results weren’t statistically significant [Figure 4]. We think this might be because the correlation test assumes a linear relationship and consistent variance, which may not apply to our data. It also doesn’t capture delayed effects, which could be important here, like temperature rising in the years after trees are removed.

Hypothesis B, which suggested that lower tree canopy would correlate with lower income, ended up giving us an “NA” in our correlation test [Figure 5]. This could be due to several factors, such as mismatched data formats or variables that don’t align well. But from everything we’ve read and seen in class, it’s clear that tree cover and green space are not evenly distributed in LA, and that environmental inequality is a real issue. So, while the data didn’t support Hypothesis B statistically, we’re hesitant to reject it completely without further investigation.

There were some limitations to our analysis. One of the biggest was the mismatch in spatial and temporal scales across datasets. For example, our temperature data was organized by year and station, but our income data was tied to census tracts. That made it difficult to compare or join the datasets in a meaningful way directly. Also, some of the environmental patterns we were looking at might take years to play out—something our current time frame might not fully capture.

Despite these challenges, the project demonstrates how useful it can be to combine different types of data to examine environmental justice issues. It also highlighted the importance of visualizations in helping us see patterns that statistics might miss on their own. From a data science perspective, it was a good reminder that correlation doesn’t always tell the whole story, especially when dealing with complex systems like climate, urban design, and social inequality.

If we continue this research, we would want to use more localized data, such as satellite images of tree cover by neighborhood, and include population density or even heat-related health data to deepen the analysis. It would also be helpful to explore other statistical models that can handle non-linear or lagged relationships.

Overall, even though our hypotheses weren’t fully supported statistically, our findings still suggest that tree canopy plays a significant role in regulating temperature, and that communities with limited access to green space may face greater climate-related risks. As the climate continues to warm, making cities cooler and more livable should be a top priority, especially for neighborhoods that have historically been underserved.

References

Jet Propulsion Laboratory. (n.d.). NASA maps key heat wave differences in Southern California. NASA. https://www.nasa.gov/centers-and-facilities/jpl/nasa-maps-key-heat-wave-differences-in-southern-california/

Global Heat Watch. (n.d.). [Title not provided]. https://www.globalforestwatch.org/dashboards/country/USA/5/19/?category=forest-change&location=WyJjb3VudHJ5IiwiVVNBIiwiNSIsIjE5Il0%3D

Weng, Q. (2009). Thermal infrared remote sensing for urban climate and environmental studies: Methods, applications, and trends. ISPRS Journal of Photogrammetry and Remote Sensing, 64(4), 335–344. https://doi.org/10.1016/j.isprsjprs.2009.03.007

Hsu, A., Sheriff, G., Chakraborty, T., & Manya, D. (2021). Disproportionate exposure to urban heat island intensity across major U.S. cities. Nature Communications, 12, Article 2721. https://doi.org/10.1038/s41467-021-22799-5

Chen, B., Wu, S., Song, Y., Webster, C., Xu, B., & Gong, P. (2022). Contrasting inequality in human exposure to greenspace between cities of Global North and Global South. Nature Communications, 13, Article 4581. https://doi.org/10.1038/s41467-022-32258-4

Ziter, C. D., Pedersen, E. J., Kucharik, C. J., & Turner, M. G. (2019). Scale-dependent interactions between tree canopy cover and impervious surfaces reduce daytime urban heat during summer. Proceedings of the National Academy of Sciences, 116(15), 7575–7580. https://doi.org/10.1073/pnas.1817561116

Nowak, D. J., Hirabayashi, S., Bodine, A., & Greenfield, E. (2014). Tree and forest effects on air quality and human health in the United States. Environmental Pollution, 193, 119–129. https://doi.org/10.1016/j.envpol.2013.05.050

van den Bosch, M., Oude Groeniger, J., de Jong, T., & Kardan, O. (2017). Longitudinal associations of green space with mental health in urban residents: 10-year follow-up of the GLOBE study. Environment International, 108, 293–302. https://doi.org/10.1016/j.envint.2016.10.019

Los Angeles County Department of Public Health. (n.d.). Median income and AMI census tract. https://data.lacounty.gov/Economy/Median-Income-and-AMI-census-tract/4ihp-bh6h

National Weather Service Forecast Office. (n.d.). Station Fire: A meteorological perspective. https://www.weather.gov/media/wrh/online_publications/talite/talite1002-1.pdf

Schell, C., Dyson, K., Fuentes, T., Des Roches, S., Harris, N., Miller, D., Woelfle-Erskine, C., & Lambert, M. (2020). Ecological and evolutionary consequences of systemic racism in urban ecosystems. Science, 369(6510), eaay4497. https://doi.org/10.1126/science.aay4497